NetNews Offline 2

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Offline 2 / NetNews Offline Volume 2.iso / news / comp / lang / c-part2 / 10052 < prev next >

Wrap

Internet Message Format | 1996-08-05 | 1.9 KB

Path: anvil.ugrad.cs.ubc.ca!not-for-mail From: c2a192@ugrad.cs.ubc.ca (Kazimir Kylheku) Newsgroups: comp.lang.c Subject: Re: C uses simple LALR parsing? Date: 14 Mar 1996 17:18:13 -0800 Organization: Computer Science, University of B.C., Vancouver, B.C., Canada Message-ID: <4iagglINNdu7@anvil.ugrad.cs.ubc.ca> References: <4i9ilq$92a@sargas.omicron.se> NNTP-Posting-Host: anvil.ugrad.cs.ubc.ca In article <4i9ilq$92a@sargas.omicron.se>, Elias Martenson <elias@omicron.se> wrote: >This question has been subject to major debate among me and my firends: > >Does the C standard specify what kind of parser to use? If it uses >(and is does seem that at least AT&T C and GCC does) a simple >LALR parser it would mean that the following expression: > > a+++++b > >Yields: > > SYMBOL INCR INCR ADD SYMBOL > >Instead of the desired result (which is acieved when doing a++ + ++b): > > SYMBOL INCR ADD INCR SYNBOL > >Is it legal for a C compiler to parse the expression like the last >example? The distinction you are describing has nothing to do with LALR(1) parsing, and everything with _lexical analysis_. Lexical analysis is based on regular expressions, which are weaker than LR grammars (a regular expression is basically a grammar which is only right recursive---a non terminal symbol never appears between two terminals in any production), hence it cannot match things like nested parentheses. The answer to your question is, yes, the C standard lexical rules are indeed unambigous. The expression 'a+++++b' will be tokenized as 'a ++ ++ + b'. The rule used is simple: extract from the input the longest lexeme that matches the regular expression pattern. When faced with '+++++++', the lexical analyzer will extract '++' rather than '+'. Both are matching tokens, but the first one is longer. The compiler will not figure out the alternate scan just to eliminate the syntax error. --